1 |
Automatic annotation of context and speech acts for dialogue corpora
|
|
|
|
Abstract:
Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utterance transcriptions) can be automatically processed to yield new corpora where dialogue context and speech acts are accurately represented. We present a conceptual and computational framework for generating such corpora. As an example, we present and evaluate an automatic annotation system which builds 'Information State Update' (ISU) representations of dialogue context for the Communicator (2000 and 2001) corpora of human-machine dialogues (2,331 dialogues). The purposes of this annotation are to generate corpora for reinforcement learning of dialogue policies, for building user simulations, for evaluating different dialogue strategies against a baseline, and for training models for context-dependent interpretation and speech recognition. The automatic annotation system parses system and user utterances into speech acts and builds up sequences of dialogue context representations using an ISU dialogue manager. We present the architecture of the automatic annotation system and a detailed example to illustrate how the system components interact to produce the annotations. We also evaluate the annotations with respect to the task completion metrics of the original corpus and in comparison to hand-annotated data and annotations produced by a baseline automatic system. The automatic annotations perform well and largely outperform the baseline automatic annotations in all measures. The resulting annotated corpus has been used to train high-quality user simulations and to learn successful dialogue strategies. The final corpus will be made publicly available.
|
|
URL: http://doc.rero.ch/record/304005/files/S1351324909005105.pdf
|
|
BASE
|
|
6 |
The User Model-Based Summarize and Refine Approach Improves Information Presentation in Spoken Dialog Systems
|
|
|
|
In: ISSN: 0885-2308 ; EISSN: 1095-8363 ; Computer Speech and Language ; https://hal.archives-ouvertes.fr/hal-00692184 ; Computer Speech and Language, Elsevier, 2010, 25 (2), pp. 175. ⟨10.1016/j.csl.2010.04.003⟩ (2010)
|
|
BASE
|
|
7 |
BEETLE II: A System for Tutoring and Computational Linguistics Experimentation
|
|
|
|
In: DTIC (2010)
|
|
BASE
|
|
8 |
The Impact of Interpretation Problems on Tutorial Dialogue
|
|
|
|
In: DTIC (2010)
|
|
BASE
|
|
9 |
A Study of Feedback Strategies in Foreign Language Classrooms and Tutorials with Implications for Intelligent Computer-Assisted Language Learning Systems
|
|
|
|
BASE
|
|
10 |
Automatic Annotation of Context and Speech Acts for Dialogue Corpora.
|
|
|
|
BASE
|
|
11 |
Generating Tailored, Comparative Descriptions with Contextually Appropriate Intonation
|
|
|
|
BASE
|
|
12 |
The MATCH Corpus: A Corpus of Older and Younger Users' Interactions With Spoken Dialogue Systems.
|
|
|
|
BASE
|
|
14 |
Metacognitive Awareness versus Linguistic Politeness: Expressions of Confusion in Tutorial Dialogues
|
|
|
|
In: DTIC (2009)
|
|
BASE
|
|
15 |
Automatic annotation of context and speech acts for dialogue corpora
|
|
|
|
In: ISSN: 1351-3249 ; Natural Language Engineering, Vol. 15, No. 3 (2009), pp. 315-353
|
|
BASE
|
|
16 |
Evaluating Information Presentation Strategies for Spoken Dialogue Systems
|
|
|
|
BASE
|
|
17 |
Personality and alignment processes in dialogue: towards a lexically-based unified model
|
|
|
|
BASE
|
|
18 |
Context Effects in Language Production: Models of Syntactic Priming in Dialogue Corpora
|
|
|
|
BASE
|
|
19 |
Evaluating the impact of variation in automatically generated embodied object descriptions
|
|
|
|
BASE